Bayesian Statistics

1. Principle of Maximum Entropy

Among all probability distributions consistent with the current state of knowledge about a system, the one that best represents that knowledge is the one with the largest entropy.
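
As a rough illustration, the sketch below computes the Shannon entropy of three distributions over four outcomes. The distributions and the helper name `shannon_entropy` are made up for this example. With no constraint beyond normalization, the uniform distribution has the largest entropy.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy (in bits) of a discrete distribution."""
    # Outcomes with zero probability contribute nothing to the sum.
    return sum(-p * math.log2(p) for p in probs if p > 0)

# Hypothetical distributions over four outcomes.
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]
certain = [1.0, 0.0, 0.0, 0.0]

for name, dist in [("uniform", uniform), ("skewed", skewed), ("certain", certain)]:
    print(f"{name}: {shannon_entropy(dist):.3f} bits")
```

The more the probability mass concentrates on one outcome, the lower the entropy, down to zero when the outcome is certain.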

1.1. Principle of Indifference

In the absence of any evidence, credence (the degree of belief) should be distributed equally among all possible outcomes.

2. Prior Probability

  • The probability distribution of a quantity before the evidence is taken into account.

2.1. Strength

  • The degree of certainty about the system that the prior encodes. A strong prior changes little when it is updated with new evidence.
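
One way to see the effect of prior strength is to compare two priors with the same mean. The sketch below assumes a Beta prior on a coin's heads probability and Bernoulli observations. The model, the pseudo-counts, and the data are all choices made for this example.

```python
def beta_posterior_mean(a, b, successes, failures):
    """Posterior mean of a Beta(a, b) prior after Bernoulli observations."""
    return (a + successes) / (a + b + successes + failures)

# Hypothetical data: 8 heads and 2 tails.
successes, failures = 8, 2

# Both priors have mean 0.5, but the second carries far more pseudo-counts.
weak = beta_posterior_mean(1, 1, successes, failures)
strong = beta_posterior_mean(50, 50, successes, failures)

print(f"weak prior   Beta(1, 1):   {weak:.3f}")
print(f"strong prior Beta(50, 50): {strong:.3f}")
```

The weak prior moves most of the way toward the observed frequency of 0.8, while the strong prior stays close to its original mean of 0.5.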

3. Bayes' Theorem

  • Bayes' Law, Bayes' Rule

3.1. Statement

  • \[ \operatorname{P}[A|B] = \frac{\operatorname{P}[B|A]\operatorname{P}[A]}{\operatorname{P}[B]} \] where \(A\) and \(B\) are events, \(\operatorname{P}[B] \neq 0\), and \(\operatorname{P}[A|B]\) denotes the conditional probability of \(A\) given \(B\).
  • According to the Bayesian probability interpretation:
    • \(\operatorname{P}[A|B]\) is the posterior probability of \(A\) given \(B\).
    • \(\operatorname{P}[B|A]\) is the likelihood of \(A\) given a fixed \(B\), since \(\operatorname{P}[B|A] = \operatorname{L}[A|B]\).
    • \(\operatorname{P}[A]\) is the prior probability of \(A\).
    • \(\operatorname{P}[B]\) is the marginal probability of \(B\), also called the evidence.
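
A small numeric sketch of the theorem follows. The scenario is hypothetical: \(A\) is "has a condition" with a prior of 1%, and \(B\) is "tests positive", with an assumed 95% true-positive rate and 5% false-positive rate. The marginal \(\operatorname{P}[B]\) is obtained from the law of total probability.

```python
def posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """P[A|B] from Bayes' theorem; P[B] comes from the law of total probability."""
    p_b = p_b_given_a * prior_a + p_b_given_not_a * (1 - prior_a)
    return p_b_given_a * prior_a / p_b

# Hypothetical numbers: prior 1%, true-positive rate 95%, false-positive rate 5%.
p = posterior(prior_a=0.01, p_b_given_a=0.95, p_b_given_not_a=0.05)
print(f"P[A|B] = {p:.3f}")
```

Even with a fairly accurate test, the posterior stays well below one half here because the prior is so small.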

4. Conjugate Distribution

If the prior distribution and the posterior distribution belong to the same family of probability distributions, then the prior and posterior are called conjugate distributions, and the prior is called a conjugate prior for the likelihood function.
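
A standard illustration, chosen here as an example rather than taken from the definition above, is a Beta prior \(g(\theta)\) with parameters \(\alpha, \beta\) combined with a binomial likelihood \(f(x\mid\theta)\) for \(x\) successes in \(n\) trials:

\begin{align*} g(\theta) &\propto \theta^{\alpha-1}(1-\theta)^{\beta-1}, \\ f(x\mid\theta) &\propto \theta^{x}(1-\theta)^{n-x}, \\ g(\theta\mid x) &\propto f(x\mid\theta)g(\theta) \propto \theta^{\alpha+x-1}(1-\theta)^{\beta+n-x-1}. \end{align*}

The posterior is again a Beta distribution, now with parameters \(\alpha + x\) and \(\beta + n - x\), so the Beta family is conjugate to the binomial likelihood.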

5. Bayes Estimator

  • Bayes Action

An estimator or decision rule that minimizes the posterior expected value of a loss function.
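
The definition can be checked numerically on a toy problem. The sketch below assumes a made-up discrete posterior and searches a grid of candidate actions for the one with the smallest posterior expected loss. Under squared-error loss the minimizer should match the posterior mean, and under absolute-error loss it should match a posterior median.

```python
# Hypothetical discrete posterior over a parameter theta.
thetas = [0.0, 1.0, 2.0, 3.0]
posterior = [0.6, 0.1, 0.1, 0.2]

def squared_loss(action, theta):
    return (action - theta) ** 2

def absolute_loss(action, theta):
    return abs(action - theta)

def expected_loss(action, loss):
    """Posterior expected loss of choosing `action`."""
    return sum(p * loss(action, t) for t, p in zip(thetas, posterior))

def bayes_action(loss, candidates):
    """Candidate action that minimizes the posterior expected loss."""
    return min(candidates, key=lambda a: expected_loss(a, loss))

# Grid of candidate actions on [0, 3] with step 0.01.
candidates = [i / 100 for i in range(301)]

posterior_mean = sum(p * t for t, p in zip(thetas, posterior))
best_squared = bayes_action(squared_loss, candidates)
best_absolute = bayes_action(absolute_loss, candidates)

print("squared loss: ", best_squared, "posterior mean:", posterior_mean)
print("absolute loss:", best_absolute)
```

The two loss functions lead to different actions for the same posterior, which is why a Bayes estimator is always defined relative to a loss function.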

6. Maximum A Posteriori Probability Estimator

  • MAP Estimator

6.1. Description

The maximum likelihood estimate of \(\theta\): \[ \hat{\theta}_{\rm MLE}(x) = \operatorname*{arg\ max}_{\theta} f(x\mid\theta) \] can be generalized to include the prior distribution \(g(\theta)\) using Bayes' theorem:

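\begin{align*} \hat{\theta}_{\rm MAP}(x) &= \operatorname*{arg\ max}_{\theta}\frac{f(x\mid \theta)g(\theta)}{\int_\Theta f(x\mid\vartheta)g(\vartheta)d\vartheta} \\[10pt] &= \operatorname*{arg\ max}_{\theta} f(x\mid \theta)g(\theta). \end{align*}

The denominator is the marginal density of \(x\). It does not depend on \(\theta\), so dropping it leaves the maximizer unchanged.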
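
The difference between the two estimates can be sketched with a grid search. The example assumes Bernoulli data and a Beta prior. The counts, the prior parameters, and the function names are all hypothetical. Logarithms are maximized instead of the raw products, which leaves the arg max unchanged.

```python
import math

def log_likelihood(theta, heads, tails):
    """Log of the Bernoulli likelihood f(x | theta), up to an additive constant."""
    return heads * math.log(theta) + tails * math.log(1 - theta)

def log_beta_prior(theta, a, b):
    """Log of the Beta(a, b) prior density g(theta), up to an additive constant."""
    return (a - 1) * math.log(theta) + (b - 1) * math.log(1 - theta)

heads, tails = 8, 2  # hypothetical observations
a, b = 3, 3          # hypothetical prior parameters

# Grid over the open interval (0, 1) so that the logarithms stay finite.
grid = [i / 1000 for i in range(1, 1000)]

theta_mle = max(grid, key=lambda t: log_likelihood(t, heads, tails))
theta_map = max(
    grid,
    key=lambda t: log_likelihood(t, heads, tails) + log_beta_prior(t, a, b),
)

# Closed-form mode of the Beta(a + heads, b + tails) posterior, for comparison.
closed_form_map = (a + heads - 1) / (a + b + heads + tails - 2)

print(f"MLE: {theta_mle:.3f}")
print(f"MAP: {theta_map:.3f} (closed form {closed_form_map:.3f})")
```

The prior pulls the MAP estimate away from the observed frequency toward the prior mode. With a flat prior, Beta(1, 1), the two estimates would coincide.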

Created: 2025-06-18 Wed 02:30